Andrea Kaplan
March 27, 2014
A web-based tool for community detection in networks
Who cares and why did we make this?
\[ Q = \sum\limits_r (e_{rr} - a_r^2) \]
\( r = \) a community
\( e_{rr} = \) fraction of links that connect two nodes inside the community
\( a_r = \) the fraction of links that have one or both vertices inside the community
\( Q = 0 \): number of within-community edges no better than random
\( Q \in [0.3, 0.7] \): strong community structure
\( Q=1 \): maximum value
\[ \pi(S) = \frac{\#\text{ of edges within } S}{\#\text{ of edges leaving} S} \]
\[ \phi(S) = \frac{\#\text{ of edges leaving} S}{\sum\limits_{u \in S} \text{degree of } u} \] (double counting of edges within community \( S \))
\[ \begin{matrix} e_{AA} = \frac{3}{10} = 0.3 & e_{BB} = \frac{5}{10} = 0.5 \\ a_{A}^2 = \left(\frac{5}{10}\right)^2 = 0.25 & a_{B}^2 = \left(\frac{7}{10}\right)^2 = 0.49 \\ e_{AA} - a_A^2 = 0.05 & e_{BB} - a_B^2 = 0.01 \\ ~\\ Q = 0.06 & \end{matrix} \]
\[ \begin{matrix} \pi(A) = \frac{3}{2} = 1.5 & \pi(B) = \frac{5}{2} = 2.5 \end{matrix} \]
\[ \begin{matrix} \phi(A) = \frac{2}{8} = 0.25 & \phi(B) = \frac{2}{12} = 0.167 \end{matrix} \]
The main elements of the problem themselves [graph clustering], i.e. the concepts of community and partition, are not rigorously defined, and require some degree of arbitrariness and/or common sense. (Fortunato, 2010)
Meet gravicom
(1) Control Panel, (2) Data Management, (3) Connection Table, (4) Graph Display, and (5) Tabset
http://shiny1.iastate.edu/andeek/gravicom
(must be VPNed to internal ISU network)
Theory behind the curtain
What makes it tick?
GML file structure:
graph
[
directed 0
node
[
id 0
label "Node 1"
value 100
]
node
[
id 1
label "Node 2"
value 200
]
edge
[
source 1
target 0
]
]
JSON file structure:
{
"nodes":
[{"id":"n0","v_id":"0","v_label":"Node 1","v_value":"100"},
{"id":"n1","v_id":"1","v_label":"Node 2","v_value":"200"}],
"edges":
[{"source":0, "target":1}]
}
Possible extensions to gravicom
Thank you!